Insights into the prediction of the liquid density of refrigerant systems by artificial intelligent approaches

This study presents a novel model for accurately estimating the densities of 48 refrigerant systems, categorized into five groups: hydrofluoroethers (HFEs), hydrochlorofluorocarbons (HCFCs), perfluoroalkylalkanes (PFAAs), hydrofluorocarbons (HFCs), and perfluoroalkanes (PFAs). Pressure, temperature, molecular weight, and structural groups were used as input variables. The study evaluates both the multilayer perceptron artificial neural network (MLP-ANN) and the adaptive neuro-fuzzy inference system (ANFIS) for constructing a precise model. Using a dataset of 3825 liquid density measurements together with outlier analysis, the MLP-ANN model achieved an R² of 0.975 and an MSE of 0.5575, and the ANFIS model achieved an R² of 0.967 and an MSE of 0.7337. In conclusion, the ANFIS model is proposed as an effective tool for estimating refrigerant system densities, particularly advantageous in scenarios where experimental measurements are resource-intensive or sophisticated analysis is required.


ANFIS
Zadeh introduced the notion of fuzzy logic (FL), which can place outputs on a spectrum from completely false to completely true. In contrast, classical logic can only classify conclusions as either true or false 65. The model is developed by combining linguistic if-then rules with the principles of fuzzy logic. Building on the basics of fuzzy logic gives a transparent development process and more accurate outputs. Coupling ANN and fuzzy logic makes it possible to obtain precise solutions for highly non-linear systems 66,67. ANFIS is created by integrating fuzzy logic and ANN.

Mamdani and Takagi-Sugeno are two structures for the fuzzy inference system (FIS) 68-70. The first FIS type uses logical reasoning to develop the fuzzy if-then rules, whereas the second type generates the if-then rules from the available empirical data. The ANFIS method uses the Takagi-Sugeno type inference system to represent the non-linear dependence between variables 70. For a generic ANFIS structure with two input parameters, Y_1 and Y_2, the if-then rules are given by Eqs. (4)-(7), in which g denotes the output parameter and C_i and D_i (i = 1, 2) are the fuzzy sets for Y_1 and Y_2, respectively.

Generally, this structure has five layers. The first layer performs fuzzification, using a membership function to convert the input data into linguistic terms. In this investigation, the Gaussian membership function (GM) is employed (Eq. 8), where Q represents the center of the Gaussian distribution, σ² refers to the variance, and O is the output of the layer. To obtain the most accurate model, the GM should be optimized. In the second layer, the firing strengths are computed, which makes it possible to assess the reliability of the antecedent parts (Eq. 9). In the third layer, the estimated firing strengths are normalized (Eq. 10). In the fourth layer, the linguistic terms for the output parameter are defined (Eq. 11), where q_i, r_i, and p_i are linear parameters to be optimized. Finally, in the fifth layer, all of the rules associated with an output are combined (Eq. 12).

$$h(y) = y \tag{1}$$

$$k(y) = \frac{e^{y} - e^{-y}}{e^{y} + e^{-y}} \tag{3}$$
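The five layers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: Eqs. (4)-(12) are not reproduced in this text, so the Gaussian form exp(-(Y - Q)² / (2σ²)), the product used for the firing strength, and the consequent g_i = p_i·Y_1 + q_i·Y_2 + r_i are the standard first-order Takagi-Sugeno choices, and the names `gaussian_mf`, `anfis_forward`, and the rule dictionary keys are hypothetical.

```python
import math


def gaussian_mf(y, center, sigma):
    """Layer 1: Gaussian membership grade (assumed form, center Q and spread sigma)."""
    return math.exp(-((y - center) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(y1, y2, rules):
    """Forward pass of a two-input, first-order Takagi-Sugeno ANFIS.

    Each rule is a dict with hypothetical keys:
      "c":   (center, sigma) of the fuzzy set C_i for Y1
      "d":   (center, sigma) of the fuzzy set D_i for Y2
      "pqr": (p_i, q_i, r_i), the linear consequent parameters
    """
    # Layers 1-2: membership grades, then firing strength as their product.
    firing = [
        gaussian_mf(y1, *rule["c"]) * gaussian_mf(y2, *rule["d"])
        for rule in rules
    ]
    # Layer 3: normalize the firing strengths so they sum to one.
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: linear consequent of each rule (assumed assignment of p, q, r).
    outputs = [
        rule["pqr"][0] * y1 + rule["pqr"][1] * y2 + rule["pqr"][2]
        for rule in rules
    ]
    # Layer 5: weighted sum of all rule outputs.
    return sum(w * g for w, g in zip(normalized, outputs))
```

Because the normalized firing strengths sum to one, the prediction is always a weighted average of the individual rule outputs; when every rule shares the same consequent, the membership functions have no effect on the result.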

Preprocessing procedure
In this paper, two strategies, ANFIS and MLP-ANN, are applied to estimate density from molecular mass (Mw), pressure (P), the structural groups, and temperature (T). The structural groups used as input parameters are shown in Table 1. MATLAB (version 2020) served as the computational environment for implementing and assessing both strategies; it was chosen for its versatility, its extensive toolboxes for neural network implementation, and its widespread use in scientific research. The dataset used for modeling consists of 3825 data points collected from the literature 16,71-86. The collected data were divided into training and testing subsets: 75% of the data were used to train the models and 25% to test them. The data were normalized using Eq. (13) 87, in which norm, max, and min denote the normalized, maximum, and minimum values of a parameter, respectively. The normalized data span from -1 to 1. Density is the output of the model, and the four parameters mentioned earlier are the input variables.
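The preprocessing steps can be illustrated with a short Python sketch. Eq. (13) is not reproduced in this text, so the code assumes the usual min-max mapping onto [-1, 1], x_norm = 2(x - x_min)/(x_max - x_min) - 1, which matches the stated range. The text also does not say how the 75%/25% split was drawn, so the seeded random shuffle is an assumption, and the function names are hypothetical.

```python
import random


def normalize_column(values):
    """Map one parameter's values onto [-1, 1] (assumed form of the normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no range to scale.
        raise ValueError("cannot normalize a constant column")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def split_indices(n_samples, train_fraction=0.75, seed=0):
    """Return (train, test) index lists from a seeded random shuffle (assumed scheme)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]
```

For the 3825 data points reported here, `split_indices(3825)` yields 2869 training and 956 testing indices, since 75% of 3825 is not a whole number and the training count is rounded.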

MLP-ANN
The general output of this model is given by Eq. (14). In Eq. (14), the two weight terms, the second of which is W_{i,3}, are the weight vectors for the neurons and for the output-layer neurons, respectively; n is the number of hidden-layer neurons; and b_3 is the bias term.
The optimum output is obtained by minimizing the difference between the real and estimated data, which is done by adjusting the weight and bias parameters of the ANN. For this task, we use the error function defined in Eq. (15). The notation r^j_i represents the ith actual output for the jth data point, while o^{j,l}_i denotes the output of the ith neuron in the first layer, where j is the data point index in the training dataset. The optimization is carried out with the Levenberg-Marquardt algorithm. Fig. 1 shows the performance of the network in terms of the MSE obtained with MLP-ANN.
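The network output and error function described above can be illustrated as follows. Eqs. (14) and (15) are not reproduced in this text, so this sketch assumes a single hidden layer with the hyperbolic tangent of Eq. (3), a linear output neuron as in Eq. (1), and a plain sum of squared differences as the error. The hidden-layer biases and all function and argument names are assumptions, and the Levenberg-Marquardt update itself is not shown.

```python
import math


def mlp_predict(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer MLP: tanh hidden neurons followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Linear output: weighted sum of hidden activations plus the output bias (b_3).
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


def sum_squared_error(targets, predictions):
    """Assumed error function: sum of squared differences between real and estimated data."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions))
```

A training algorithm such as Levenberg-Marquardt would adjust the weights and biases passed to `mlp_predict` so that `sum_squared_error` over the training set decreases.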

ANFIS
Figure 2 shows the diagram of a generic ANFIS with two input variables. The proposed ANFIS is trained using a genetic algorithm (GA). Equation (16) gives the total number of parameters of this model, which depends on the number of variables (N_v), the number of clusters (N_c), and the number of MF parameters (N_MF). In this manuscript, the MF used is the GM function, and Z and σ² are the MF parameters. Pressure (P), temperature (T), molecular mass (Mw), and the structural groups are the input variables. Accordingly, the total number of ANFIS parameters obtained is 480. For the GA used to find the optimum parameters of this model, the cost function is the RMSE between the real and estimated data. Figure 3 shows the RMSE value at each iteration.

Models' evaluation
To assess the precision of the predictive models, the root mean squared error (RMSE), coefficient of determination (R²), mean squared error (MSE), standard deviation (STD), and average absolute relative deviation (AARD) are used as statistical criteria.
The mathematical definitions of these criteria are given in the equations beginning with Eq. (17).
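The criteria listed above can be computed as in the following Python sketch. The defining equations are not reproduced in this text, so the code uses the forms most commonly found in the literature: in particular, STD is taken here as the sample standard deviation of the prediction errors and AARD is expressed in percent. Both choices, and the function name `regression_metrics`, are assumptions.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, R2, STD and AARD (%) for paired real and estimated values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mean_error = sum(errors) / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        # R2 = 1 - (residual sum of squares) / (total sum of squares)
        "r2": 1.0 - sum(e * e for e in errors) / ss_tot,
        # Assumed definition: sample standard deviation of the errors.
        "std": math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1)),
        # Average absolute relative deviation, in percent.
        "aard": 100.0 / n * sum(abs(e) / abs(a) for a, e in zip(actual, errors)),
    }
```

Note that AARD divides by the real value, so it is undefined for data points whose real value is zero; this is not a concern for liquid densities.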

Results and discussion
The presented strategies were employed to determine the density of various refrigerant systems using pressure (P), temperature (T), the structural groups, and molecular mass (Mw) as input parameters. More information about these intelligent models is given in Table 2. The MLP-ANN model estimates the density values better than the other model (ANFIS), as confirmed by the statistical analysis described in the models' evaluation section.
To assess the efficacy of the models, we used several graphical methods. Figure 4 compares the experimental and predicted density values for each model. The MLP-ANN model is the more precise of the two, showing closer agreement between predicted and actual values.
Regression plots of experimental versus predicted density values are shown in Fig. 5. The best-fit lines are obtained by linear regression between the real and estimated values (Fig. 5a,b).
The relative deviations between the estimated and real data are shown in Fig. 6. The smallest deviation is obtained with the MLP-ANN model.
This is because its data points cluster around the zero line. The MRE values for MLP-ANN and ANFIS are 4.751 and 5.068, respectively. In addition, statistical error analyses were carried out to evaluate the capability of these models in predicting the density values. These analyses are given in Table 3.

Outlier detection
The precision of the proposed models is significantly affected by the actual data used for model development 89. To ensure the robustness of the models, locating and removing data points that behave differently from the rest of the dataset, referred to as outliers, is regarded as a crucial step 90. Leverage analysis is used together with standardized residuals to determine potential outliers. Outliers are detected with the Williams plot, in which standardized residuals (R) are plotted against hat values (H). This approach allows us to assess the data points that might disproportionately influence the model outcomes. Equation (22) is used to calculate the diagonal components of the hat matrix, which are expressed as hat values and are used to identify the feasible region.
With n as the number of data points and k as the number of input parameters, X represents an (n × k) matrix. This matrix is instrumental in evaluating the influence of each data point on the model. The warning leverage on the horizontal axis and the cut-off values on the vertical axis bound a square area called the feasible region.

The warning leverage sets the horizontal threshold for identifying potential outliers, and the threshold value for R is typically taken to be 3. Data points that fall outside the boundaries of the feasible region are treated as outliers.

While the presented models, MLP-ANN and ANFIS, perform well in estimating the densities of various refrigerant systems, the current study has certain limitations. One limitation is the reliance on a specific dataset of 3825 data points. Although the dataset is sourced from reputable literature, its scope may not cover all possible scenarios and variations in refrigerant properties. Additionally, the models are developed from the selected input parameters: molecular mass, pressure, structural groups, and temperature. Excluding relevant parameters, or considering additional factors, could affect the models' generalizability to a broader range of refrigerant systems. Furthermore, the outlier analysis identified specific data points that deviate from the predicted trends. While these outliers were addressed, their presence underscores the sensitivity of the models to anomalous data. Future research could enhance the robustness of the models by incorporating more diverse datasets, exploring additional input parameters, and implementing advanced outlier detection techniques. Continuous refinement of the models through ongoing validation against experimental data will also contribute to their reliability in practical applications.
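The leverage screening described in this section can be sketched in Python as follows. Neither the hat-matrix equation nor the warning-leverage equation is reproduced in this text, so the code assumes the standard definitions H = X(XᵀX)⁻¹Xᵀ and H* = 3(k + 1)/n. Whether X includes an intercept column is not stated, and all function names are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def hat_values(X):
    """Diagonal of H = X (X^T X)^-1 X^T for an n x k matrix given as a list of rows."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    inv = invert(xtx)
    return [sum(row[i] * inv[i][j] * row[j] for i in range(k) for j in range(k))
            for row in X]


def warning_leverage(n, k):
    """Assumed standard definition of the warning leverage: H* = 3 (k + 1) / n."""
    return 3.0 * (k + 1) / n


def flag_outliers(hat, std_residuals, h_star, r_limit=3.0):
    """Indices of points outside the feasible region (H > H* or |R| > r_limit)."""
    return [i for i, (h, r) in enumerate(zip(hat, std_residuals))
            if h > h_star or abs(r) > r_limit]
```

A useful sanity check is that the hat values of a full-rank X sum to k, the number of columns, because H is a projection matrix of rank k.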

Conclusions
In summary, the densities of 48 refrigerant systems were predicted with two intelligent models using molecular mass (Mw), pressure (P), structural groups, and temperature (T) as input parameters. The MLP-ANN approach outperformed the ANFIS model, as shown by its consistently lower error values. Beyond its theoretical significance, the research has practical implications for scientists and engineers engaged in the design of economically viable low-temperature refrigeration cycles. The accuracy of the MLP-ANN model makes it a valuable tool for guiding the optimization of system performance and the development of energy-efficient and cost-effective refrigeration technologies. These implications underscore the study's contribution to advancements in refrigeration system design. The findings deepen the understanding of the factors involved and contribute to the evolution of modeling methodologies in refrigeration technology.

Figure 1. MLP-ANN's performance based on Mean Squared Error across various iterations of the LM algorithm.

Figure 3. Performance of ANFIS during the training stage employing the Particle Swarm Optimization approach.

Figure 4. Comparison between estimated density values and experimental data using different models: (a) ANFIS, and (b) MLP-ANN.

Figure 7 shows the Williams plot. According to this figure, 27 points for the ANFIS approach and 17 points for the MLP-ANN approach lie outside the feasible region.

Figure 5. Regression diagram to estimate density using different models in the training and testing steps: (a) ANFIS, (b) MLP-ANN.

Figure 6. Percentage relative deviation of testing and training data with various models: (a) ANFIS, and (b) MLP-ANN.

Table 1. Categories of structural groups examined in MLP-ANN and ANFIS models.

Table 2. Additional information about models trained for density estimation.

Table 3. Assessing the effectiveness of the proposed models through statistical analysis.